[Azure Logs] Add dedicated aadgraphactivitylogs data stream #18880
Adds the azure.aadgraphactivitylogs data stream to ingest the AzureADGraphActivityLogs diagnostic category from Microsoft Entra ID, parallel to azure.graphactivitylogs for Microsoft Graph. Without this, AAD Graph events fall through to azure.platformlogs and the AAD-Graph-specific properties survive only inside event.original.
efd6 left a comment
Please update the proposed commit message so that it can be used in the context of git: no Markdown, appropriately wrapped, etc.
For example based on the current code here (update as needed):
azure: add aadgraphactivitylogs data stream

Add a dedicated data stream for the AzureADGraphActivityLogs
diagnostic category from Microsoft Entra ID. Without this,
legacy Azure AD Graph (graph.windows.net) events fall through
to the platformlogs catch-all and lose schema-aware parsing.

The events router maps routing.category ==
"AzureADGraphActivityLogs" to the new dataset. The ingest
pipeline extracts ECS fields: event.action from HTTP method +
URI collection, event.outcome from response status,
event.category [iam, web], and related.user including the
OAuth app_id for client correlation.

Legacy AAD Graph is still actively used by Microsoft first-party
tooling, older line-of-business apps, and adversary tooling
(ROADtools, AzureHound v1, AADInternals). The dedicated dataset
makes these events available for detection rules and dashboards.
@efd6 proposed commit message updated. Thank you!
The build is complaining:
I think you will need to add a line before this. Who will be the owner of this data stream?
Yes, I noticed the Buildkite failure related to owners. There seem to be a few different owners across the Azure package data streams, so I am not sure which team should be the owner/maintainer. Since this is the legacy counterpart to the Microsoft Graph data stream and was provisioned to write threat detection rules on, I assume we mirror that one, so @elastic/security-service-integrations?
This pipeline does match about half the other pipelines with the same name:
find -name 'azure-shared-pipeline.yml' | xargs md5sum | sort
28624170d9ba87d593c9aef7dd72284a ./data_stream/application_gateway/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
28624170d9ba87d593c9aef7dd72284a ./data_stream/firewall_logs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
352b4a0232fcf818b45a174958b161e8 ./data_stream/identity_protection/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
352b4a0232fcf818b45a174958b161e8 ./data_stream/provisioning/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
8fb30aa0822189b990f17ba026aeb928 ./data_stream/platformlogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
d9d42c87fb050f4264f140fe5c4fddd0 ./data_stream/aadgraphactivitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
d9d42c87fb050f4264f140fe5c4fddd0 ./data_stream/activitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
d9d42c87fb050f4264f140fe5c4fddd0 ./data_stream/auditlogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
d9d42c87fb050f4264f140fe5c4fddd0 ./data_stream/eventhub/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
d9d42c87fb050f4264f140fe5c4fddd0 ./data_stream/graphactivitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
d9d42c87fb050f4264f140fe5c4fddd0 ./data_stream/signinlogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
d9d42c87fb050f4264f140fe5c4fddd0 ./data_stream/springcloudlogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
It's a good time to use the new links functionality, at least for the ones that do still match.
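To see at a glance which copies are byte-identical, the checksum listing above can be grouped per hash. This is a minimal sketch, assuming GNU coreutils and paths without spaces; `list_duplicate_groups` is a hypothetical helper name and is not part of elastic-package.

```shell
# list_duplicate_groups [ROOT]: print each checksum shared by more than one
# azure-shared-pipeline.yml under ROOT, followed by the files that carry it.
# Hypothetical helper; assumes md5sum, sort and awk are available.
list_duplicate_groups() {
  root="${1:-.}"
  find "$root" -name 'azure-shared-pipeline.yml' -type f -exec md5sum {} + |
    sort |
    awk '{ count[$1]++; files[$1] = files[$1] "\n  " $2 }
         END {
           for (sum in count)
             if (count[sum] > 1)
               print sum " (" count[sum] " copies)" files[sum]
         }'
}

# Usage, from the package root:
#   list_duplicate_groups .
```

Only groups with more than one member are printed, so a copy that has drifted (such as the platformlogs one in the listing above) does not appear as a link candidate.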
Added per recommendations: 7be1278fcf4bcd9e2723782c95c43f5a55f45a2d
Note that I did not convert all of the copies that could be converted, as that seems out of scope for this PR. Maybe a separate issue/PR for that?
It's a step in the right direction. I think it would be good to move that version of the file to the top level and change the identical copies to links. Could be done in this PR.
# place the shared copy at the top level
mkdir -p ./_dev/shared/
mv ./data_stream/graphactivitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml ./_dev/shared/
# update link to the shared copy
echo "../../../../_dev/shared/azure-shared-pipeline.yml d9410a8b01f785a2e328560cc5d9a2286c8ca5b4e6ab5ec0edfaba385ddb0fb9" > \
./data_stream/aadgraphactivitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml.link
# remove other hard copies
rm ./data_stream/activitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
rm ./data_stream/auditlogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
rm ./data_stream/eventhub/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
rm ./data_stream/signinlogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
rm ./data_stream/springcloudlogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml
# add other links to the shared copy
cp ./data_stream/aadgraphactivitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml.link ./data_stream/activitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml.link
cp ./data_stream/aadgraphactivitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml.link ./data_stream/auditlogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml.link
cp ./data_stream/aadgraphactivitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml.link ./data_stream/eventhub/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml.link
cp ./data_stream/aadgraphactivitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml.link ./data_stream/graphactivitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml.link
cp ./data_stream/aadgraphactivitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml.link ./data_stream/signinlogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml.link
cp ./data_stream/aadgraphactivitylogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml.link ./data_stream/springcloudlogs/elasticsearch/ingest_pipeline/azure-shared-pipeline.yml.link
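A quick way to sanity-check the generated link files is to recompute the digest of the target and compare it with the second field. The sketch below assumes the two-field layout written by the echo above (a relative path, then what appears to be a SHA-256 digest, given its 64 hex digits), with the path resolved relative to the link file's own directory. `check_link` is a hypothetical helper, and the real tooling may validate more than this.

```shell
# check_link LINKFILE: succeed only if the digest recorded in LINKFILE matches
# the sha256 of the file it points at. Hypothetical helper; assumes sha256sum
# is available and the "<relative path> <sha256>" layout described above.
check_link() {
  link="$1"
  # read returns non-zero when the last line has no trailing newline,
  # so fall back to checking that a target was actually parsed.
  read -r target expected < "$link" || [ -n "$target" ]
  actual=$(sha256sum "$(dirname "$link")/$target" 2>/dev/null | cut -d' ' -f1)
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage, from the package root:
#   for f in ./data_stream/*/elasticsearch/ingest_pipeline/*.link; do
#     check_link "$f" || echo "stale or broken link: $f"
#   done
```

Running the loop after the script above should print nothing if every link points at the shared copy and the recorded digest is current.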
TL;DR
Buildkite failed before tests ran because the PR checkout hook hit a GitHub remote
Remediation
Investigation details
Root Cause
The failing step is
Relevant code path:
The PR changes are focused on Azure package files/CODEOWNERS and do not modify Buildkite hooks, which supports this being infra/transient rather than a PR logic/config bug.
Evidence
Verification
Follow-up
If the retry succeeds, no PR code change is needed. If repeated 500s continue across builds, treat as persistent CI infrastructure issue and add fetch retry logic in the hook.
From workflow: PR Buildkite Detective
✅ Vale Linting Results: No issues found on modified lines!
…e as comment in ingest pipeline
…after Microsoft Graph Activity Logs in the supported-services list.
@chrisberkhout - Thank you again for the feedback and guidance. All comments have been addressed. For each comment, I added the commit for easier reference. The latest Buildkite build has succeeded. Additionally, I've rebuilt a new stack locally and ingested AAD Graph Activity Logs to verify that the logs are as expected and the integration updates are accurate. Please let me know if there is anything else to add/adjust.
    tag: pipeline-azure-shared-pipeline
- fingerprint:
    fields:
      - azure.aadgraphactivitylogs.properties.request_uri
The pipeline removes the azure.aadgraphactivitylogs.properties.request_uri field a few lines before if url.original is not null. Should we keep it or replace this with a different field?
@zmoog Good catch, and thank you for all the review! If I understand correctly, the remove a few lines up was stripping properties.request_uri before the fingerprint got to use it, so that input was silently dropping out (ignore_missing: true masked it). The hash was effectively only using http.request.id, azure.tenant_id, and properties.time_generated; I confirmed this against the data.
Swapped it to use url.original instead, which holds the same value (the uri_parts step keeps a copy via keep_original: true) and is still around at fingerprint time. Re-streamed live events through twice end-to-end and confirmed the _ids are stable now.
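The failure mode described here can be reproduced outside Elasticsearch with a toy hash. This sketch is not the fingerprint processor's actual algorithm; it only assumes that an absent input is skipped rather than rejected, which is the behavior attributed to ignore_missing: true above. The `fingerprint` function and the sample values are made up for illustration.

```shell
# fingerprint VALUE...: sha256 over the non-empty values only, so an empty
# (removed) input contributes nothing. Toy model, not Elasticsearch's hashing.
fingerprint() {
  for value in "$@"; do
    if [ -n "$value" ]; then
      printf '%s|' "$value"
    fi
  done | sha256sum | cut -d' ' -f1
}

# Hash of the three fields that were always present.
three=$(fingerprint "req-1" "tenant-1" "2024-01-01T00:00:00Z")
# Same call with a fourth input that was removed earlier in the pipeline.
removed=$(fingerprint "req-1" "tenant-1" "2024-01-01T00:00:00Z" "")
# Same call with the URI read from a field that still exists.
present=$(fingerprint "req-1" "tenant-1" "2024-01-01T00:00:00Z" "/tenant/users")

[ "$three" = "$removed" ] && echo "removed field contributed nothing"
[ "$three" != "$present" ] && echo "surviving field changes the hash"
```

The point of the sketch is only that a silently skipped input leaves no trace in the result, which is why the problem was not visible until the field list was compared against the data.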
…ingest_pipeline/default.yml Co-authored-by: Maurizio Branca <maurizio.branca@elastic.co>
…ingest_pipeline/default.yml Co-authored-by: Maurizio Branca <maurizio.branca@elastic.co>
chrisberkhout left a comment
Thanks. Looks good.
I gave a script to adopt links for the other uses of the same version of the ingest pipeline.
💚 Build Succeeded
Package azure - 1.37.0 containing this change is available at https://epr.elastic.co/package/azure/1.37.0/
Proposed commit message
Checklist
changelog.yml file.
Author's Checklist
elastic-package check passes; elastic-package test pipeline -d aadgraphactivitylogs passes.
logs-azure.events-default are correctly rerouted to logs-azure.aadgraphactivitylogs-default with full ECS field extraction.
How to test this PR locally
Optional end-to-end:
https://localhost:5601 (user/pass)
logs-azure.aadgraphactivitylogs-*
event.action, event.outcome, http.*, url.path, azure.aadgraphactivitylogs.properties.*, and related.user all populate.
Related issues
Screenshots
Pipeline tests passing locally
Discover view of the new dataset with ECS-parsed events
Expanded document showing the full ECS field tree